Improving List Experiments

 

Gustavo Diaz
McMaster University
gustavodiaz.org

 

Slides: talks.gustavodiaz.org/tec

Research Agenda

Bias-variance tradeoff as darts

But the game of darts is more complicated

Two types of tradeoffs

  1. Explicit: Is a little bias worth the increase in precision?

  2. Implicit: Improving precision without sacrificing unbiasedness?


List Experiments

Example

List experiment

Here is a list of things that some people have done.

List experiment

Please listen to them and then tell me HOW MANY of them you have done in the past two years.

List experiment

Do not tell me which ones. Just tell me HOW MANY:

 

Control group

  1. Discussed politics with family or friends
  2. Cast a ballot for governor Phil Bryant
  3. Paid dues to a union
  4. Given money to a Tea Party candidate

List experiment

Do not tell me which ones. Just tell me HOW MANY:

 

Treatment group

  1. Discussed politics with family or friends
  2. Cast a ballot for governor Phil Bryant
  3. Paid dues to a union
  4. Given money to a Tea Party candidate
  5. Voted “YES” on the Personhood Initiative

Prevalence rate

\[ \text{Proportion(Voted yes)} =\\ \text{Mean(List with sensitive item)} -\\ \text{Mean(List without sensitive item)} \]

  • We get a prevalence rate estimate
  • But we do not know how individual respondents voted!
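The difference-in-means logic above can be sketched in a short simulation. Everything here is hypothetical: the 30% "true" prevalence and the 50% endorsement probability for baseline items are made-up numbers chosen only to illustrate the estimator.

```python
# Hypothetical simulation of the difference-in-means prevalence
# estimator. TRUE_PREVALENCE and the item endorsement probability
# are assumptions for illustration, not estimates from any data.
import random

random.seed(1)
TRUE_PREVALENCE = 0.3  # assumed share of "yes" voters, for simulation only
N = 4000

def baseline_count():
    """Number of the 4 baseline items a respondent endorses."""
    return sum(random.random() < 0.5 for _ in range(4))

control = [baseline_count() for _ in range(N)]
treatment = [baseline_count() + (random.random() < TRUE_PREVALENCE)
             for _ in range(N)]

prevalence_hat = sum(treatment) / N - sum(control) / N
print(round(prevalence_hat, 3))  # close to 0.3 in expectation
```

The estimator never observes any individual's answer to the sensitive item; only the aggregate difference in list counts identifies the prevalence rate.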

Compare with direct question

Did you vote YES or NO on the Personhood Initiative, which appeared on the November 2011 Mississippi General Election Ballot?

\[ \text{Proportion(Voted yes)} =\\ \text{Mean(Voted yes)} \]

Validation


Sensitivity bias reduction not always worth the increased variance

Can we do better?

Double list experiments

Example

List A

  • Californians for Disability (advocating for people with disabilities)
  • California National Organization for Women (advocating for women’s equality and empowerment)
  • American Family Association (advocating for pro-family values)
  • American Red Cross (humanitarian organization)

List B

  • American Legion (veterans service organization)
  • Equality California (gay and lesbian advocacy organization)
  • Tea Party Patriots (conservative group supporting lower taxes and limited government)
  • Salvation Army (charitable organization)

Sensitive item

Organization X (advocating for immigration reduction and measures against undocumented immigration)

  • Randomly appears in list A or B

  • Single list: Half of the respondents see sensitive item

  • Double list: Everyone sees it

  • Equivalent to two parallel list experiments
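The difference between the two assignment schemes can be sketched as follows. This is a hypothetical illustration: under a single list design the coin decides *whether* a respondent sees the sensitive item, while under a double list design it only decides *which* list carries it, so everyone sees it exactly once.

```python
# Hypothetical sketch of the two assignment schemes (respondent IDs
# and the 50/50 coin are made up for illustration).
import random

random.seed(7)
respondents = range(8)

# Single list: the coin decides who sees the sensitive item at all.
single = {i: "treatment" if random.random() < 0.5 else "control"
          for i in respondents}

# Double list: the coin only decides where the item appears, so each
# respondent contributes one treated list and one control list.
double = {i: "A" if random.random() < 0.5 else "B" for i in respondents}

print("single:", single)  # only the treatment half sees the item
print("double:", double)  # item placed in list A or B for everyone
```

Because every respondent in the double list design yields both a treated and a control observation, the design behaves like two parallel list experiments run on the same sample.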

Three prevalence estimators

\[ \hat{\tau}_A = \text{Mean}(A_t) - \text{Mean}(A_c) \]

\[ \hat{\tau}_B = \text{Mean}(B_t) - \text{Mean}(B_c) \]

\[ \hat{\tau}_{Pooled} = (\hat{\tau}_A + \hat{\tau}_B)/2 \]
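The three estimators translate directly into code. A minimal sketch, assuming responses are stored as per-respondent item counts for each list; the toy numbers are invented for illustration.

```python
# Sketch of the three DLE prevalence estimators. A_t denotes list A
# responses from respondents whose list A carried the sensitive item,
# A_c responses where it did not, and likewise for list B.
def mean(xs):
    return sum(xs) / len(xs)

def dle_estimates(a_t, a_c, b_t, b_c):
    """Return tau_A, tau_B, and their pooled average."""
    tau_a = mean(a_t) - mean(a_c)
    tau_b = mean(b_t) - mean(b_c)
    return tau_a, tau_b, (tau_a + tau_b) / 2

# Toy counts, made up for illustration only.
tau_a, tau_b, tau_pooled = dle_estimates(
    a_t=[3, 2, 3, 2], a_c=[2, 2, 1, 2],
    b_t=[2, 3, 2, 2], b_c=[2, 1, 2, 2],
)
print(tau_a, tau_b, tau_pooled)  # 0.75 0.5 0.625
```

Averaging the two list-specific estimates is what buys the variance reduction: each respondent contributes to both components of the pooled estimator.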

DLE yields more precise estimates


But variance reduction is not free

  • Baseline lists need to be comparable

  • Easiest way is to use paired items

  • American Family Association (A) \(\approx\) Tea Party Patriots (B)

  • BUT that makes it easier to spot the sensitive item

Different baseline estimates


DLE variants

List order    Sensitive item location
Fixed         Fixed
Randomized    Fixed
Fixed         Randomized
Randomized    Randomized

  • Fixed-fixed is not an admissible design
  • Randomized-fixed keeps the sensitive item in the second list
  • Fixed-randomized and randomized-randomized shuffle the sensitive item's position

Carryover design effects

Design effect (Blair and Imai 2012)

The inclusion of a sensitive item affects how survey participants respond to the baseline items within the list.

Carryover design effect

The inclusion of a sensitive item in one list affects how participants respond to the baseline items in the other list.

Toy example

Observed response              List 1   List 2   Difference
Baseline                          2        2         0
Deflation, sensitive first        1        1         0
Deflation, sensitive second       2        1         1
Inflation, sensitive first        3        3         0
Inflation, sensitive second       2        3        -1

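The toy example can be reproduced in a few lines. The point is that carryover only distorts the within-respondent difference when the sensitive item appears in the second list, which makes the distortion asymmetric across treatment schedules.

```python
# Reproducing the toy example: (list 1, list 2) observed counts under
# each carryover scenario. Only when the sensitive item appears second
# does carryover move the within-respondent difference away from zero.
scenarios = {
    "baseline":                    (2, 2),
    "deflation, sensitive first":  (1, 1),
    "deflation, sensitive second": (2, 1),
    "inflation, sensitive first":  (3, 3),
    "inflation, sensitive second": (2, 3),
}
for name, (list1, list2) in scenarios.items():
    print(f"{name:30s} difference = {list1 - list2}")
```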

Why does this happen?

  • List experiment question format
  • Lists usually appear close to each other
  • Positive correlation across lists (Glynn 2013)

Statistical tests

  • Goal: Detect asymmetric shift across treatment schedules

  • Two tests:

  1. Difference-in-differences (stacked responses)

  2. Signed-rank test (paired responses)

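The two tests can be sketched on toy data. This is a hedged illustration, assuming responses come as (list 1 count, list 2 count) pairs per respondent; the talk's actual test statistics may differ from this simplified version.

```python
# Hedged sketch of the two diagnostics: a difference-in-differences on
# stacked responses and a Wilcoxon signed-rank test on paired
# differences. Toy data invented for illustration only.
from scipy.stats import wilcoxon

# Respondents who saw the sensitive item first vs. second.
first  = [(3, 3), (2, 2), (3, 2), (2, 3), (3, 3)]
second = [(2, 3), (1, 2), (2, 3), (2, 2), (1, 3)]

def did(first, second):
    """Difference-in-differences of within-respondent list differences."""
    d1 = [a - b for a, b in first]
    d2 = [a - b for a, b in second]
    return sum(d1) / len(d1) - sum(d2) / len(d2)

print("DiD estimate:", did(first, second))  # nonzero suggests asymmetry

# Signed-rank test on the paired differences (zeros dropped so the
# exact null distribution applies).
diffs = [a - b for a, b in first + second if a != b]
stat, p = wilcoxon(diffs)
print("signed-rank p-value:", round(p, 3))
```

Under no carryover, the distribution of within-respondent differences should be symmetric across the two schedules, so both tests target the same asymmetry from different angles.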